DETAILED ACTION



Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

2. Claims 1-2, 4-5, 7-11, 14, 16, 18, 19 and 21 are rejected under 35 U.S.C. 102(a)
as being anticipated by Li et al. (US 2003/0210886). 



Regarding claims 1 and 14, Li discloses [a] key-frame extraction system (A 
number of keyframes are allocated among the shots based on the importance value of 
each shot, Li, paragraph 8), comprising: 

a set of frame analyzers that each select a set of candidate key-frames from 
among a series of video frames in a video, each frame analyzers for detecting a 
meaningful content in the video ("Frame Importance Computation", Li paragraph 51); 

key-frame selector that arranges the candidate key-frames into a set of clusters
("After decomposing the video sequence 20 into scenes 22, shots 24 and frames 26,
each component scene 22, shot 24 and frame 26 is assigned an importance value
based on measurements which are explained in greater detail below", Li, paragraph 29, 
all frames are potential key frames, frames are clustered into scene, shot and frame 
groupings) and that selects one of the candidate key-frames from each cluster as a key- 
frame for the video in response to a relative importance of each candidate key-frame 
("After frame importance has been determined, the number of keyframes NSj assigned 
to each shot must be selected from all F frames in the shot", Li, paragraph 65). 
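
For illustration only, the selection step recited in the claim (one candidate key-frame chosen per cluster according to relative importance) can be sketched as below. The data layout, the function name `select_key_frames`, and the sample scores are assumptions made for this sketch; they are not taken from Li or from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each candidate is (frame_index, importance_score).
Candidate = Tuple[int, float]


def select_key_frames(clusters: Dict[str, List[Candidate]]) -> Dict[str, int]:
    """Return, for each cluster, the frame index of its highest-scoring candidate."""
    selected: Dict[str, int] = {}
    for cluster_id, candidates in clusters.items():
        if not candidates:
            continue  # an empty cluster contributes no key-frame
        # max() keeps the first candidate on ties, so earlier frames win.
        best_index, _ = max(candidates, key=lambda c: c[1])
        selected[cluster_id] = best_index
    return selected


# Illustrative input: two clusters with made-up importance scores.
clusters = {
    "shot_a": [(10, 0.2), (14, 0.9), (20, 0.5)],
    "shot_b": [(31, 0.4), (35, 0.1)],
}
key_frames = select_key_frames(clusters)
```

In this sketch the importance scores are supplied as inputs; how they are computed is a separate question addressed by the cited passages.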

Regarding claims 2 and 19, Li discloses [t]he key-frame extraction system of 
claim 14, wherein the frame analyzers include a camera motion tracker ("Camera 
motion may be detected by analyzing the layout pattern of the extracted motion vectors 
mv", Li, Paragraph 42 and figure 3). 

Regarding claim 4, Li discloses wherein the step of selecting a set of candidate 
key-frames includes the step of selecting a set of candidate key-frames in response to a 
fast camera movement in the video ("The last two factors are used to make sure that 
the extracted keyframe is a well-focused clear image and not a blurry image, such as is 
caused by a fast camera motion, fast object movement or a bad camera focus. For 
example, the still image taken after a camera panning is preferred over the image taken 
during the panning which may be blurred or unstable", Li paragraph 52, Li discloses that 
fast camera motion causes blurring and it is better to avoid those frames and to use 
frames outside a sequence of fast camera motion).

Regarding claims 5 and 21, Li discloses wherein the frame analyzers include a
human face detector ("Four factors are considered in determining the importance of a
frame 26 according to one embodiment of the invention: 1) ... 2) the number of
detected human faces in the frame ...", Li, paragraph 52).

Regarding claim 7, Li discloses wherein the step of selecting one of the key- 
frames from each cluster includes the step of determining an importance score for each 
candidate key-frame ("an importance value is assigned to each scene, shot and frame", 
Li, paragraph 8). 

Regarding claim 8, Li discloses wherein the step of determining an importance 
score for each candidate key-frame includes the step of determining an importance 
score in response to the meaningful content in each candidate key-frame ("FIG. 1c is a 
flowchart illustrating one embodiment of the computation of importance values 
according to the invention", Li, paragraph 11, and figure 1c).

Regarding claim 9, Li discloses wherein the step of selecting one of the key- 
frames from each cluster includes the step of selecting one of the key-frames in 
response to an image quality of each candidate key-frame ("The last two factors are 
used to make sure that the extracted keyframe is a well-focused clear image and not a 
blurry image", Li, paragraph 52).

Regarding claim 10, Li discloses further comprising the step of selecting multiple
key-frames from each cluster and obtaining a user selection for the multiple key-frames 
("if the user wants a detailed summary of certain scenes 22 or shots 24 while only 
wanting a brief review of others, the invention as described herein can easily achieve 
this by using a predefined but tunable scale factor", Li paragraph 73, the system selects 
an initial keyframe set and the user can control how many to select based on a tunable 
scaling factor). 

Regarding claims 11 and 16, Li discloses wherein the analyses include an
accumulative color histogram difference comparison of the video frames ("To quantify 
the activity level for a scene 22, the frame-to-frame color histogram difference for each 
consecutive frame pair within the scene is computed and their average is used as the 
scene's activity level indicator", Li, Paragraph 34). 
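
For illustration only, the activity-level measure described in the quoted passage (the average of frame-to-frame color histogram differences over consecutive frame pairs) can be sketched as below. The pixel representation, the 4-bins-per-channel quantization, the L1 distance between histograms, and all function names are assumptions made for this sketch; the quoted passage does not specify them.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed (R, G, B) values, each in 0..255


def color_histogram(frame: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized joint RGB histogram of one frame (bin count is an assumption)."""
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in frame:
        # Clamp so that bin counts that do not divide 256 stay in range.
        ri, gi, bi = min(r // step, top), min(g // step, top), min(b // step, top)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = float(len(frame)) or 1.0
    return [h / total for h in hist]


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (the choice of distance is an assumption)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def activity_level(frames: Sequence[Sequence[Pixel]]) -> float:
    """Average histogram difference over each consecutive frame pair."""
    hists = [color_histogram(f) for f in frames]
    diffs = [histogram_difference(a, b) for a, b in zip(hists, hists[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Illustrative scene: two all-red frames followed by one all-blue frame.
red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
level = activity_level([red, red, blue])
```

The sketch averages the differences, as the quoted passage does; an accumulative variant would sum the same per-pair differences instead of dividing by their count.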



Regarding claim 18, Li discloses wherein the frame analyzers include a fast 
camera motion detector ("The last two factors are used to make sure that the extracted 
keyframe is a well-focused clear image and not a blurry image, such as is caused by a 
fast camera motion", Li paragraph 52).

Claim Rejections - 35 USC § 103

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 3, 6, 13, 15, 20 and 22 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Li et al. (US 2003/0210886) in combination with Chang et al (US 
2004/0125877). 

Regarding claims 3 and 20, Li discloses all the elements of claims 1 and 14 
above. Li does not explicitly teach selecting a set of candidate key-frames in response 
to an object motion in the video. 

Chang, working in the same field of endeavor of video analysis, does teach 
selecting a set of candidate key-frames in response to an object motion in the video ("a 
Visual Feature Extraction Module 340 extracts visual features that can be used for view 
recognition or event detection. Examples of visual features include camera motions, 
object motions, color, edge, etc.", Chang, paragraph 104). 



It would have been obvious at the time the invention was made for one of
ordinary skill in the art to add the object motion detection feature of Chang to the video
analysis system of Li in order to aid in detecting scene changes, "Since there are many
different changes in video (e.g. object motion, lighting change and camera motion), it is 
a nontrivial task to detect scene changes", (Chang, paragraph 77). 

Regarding claims 6 and 15, Li does not disclose an audio event detector that
selects a set of candidate key-frames by detecting a set of audio events in the video. 

Chang, working in the same field of endeavor of video analysis, does teach an 
audio event detector that selects a set of candidate key-frames by detecting a set of 
audio events in the video ("An Audio Feature Extraction module 345 extracts audio 
features that are used in later stages such as event detection", Chang, paragraph 105). 

It would have been obvious at the time the invention was made for one of 
ordinary skill in the art to add the Audio Feature Extraction Module feature of Chang to 
the video analysis system of Li in order to aid in detecting scene changes that have 
characteristic sound events associated with them ("more detailed analysis of motion 
such as speed, direction, repeating patterns in combination with audio analysis (e.g., 
hitting sound) may be needed.", Chang, paragraph 185). 

Regarding claims 13 and 22, Li does not disclose a user interface for displaying a
set of video frames in the video previous to each key-frame and a set of video frames in
the video subsequent to each key-frame and for obtaining a user selection of one or
more of the video frames. 

Chang, working in the same field of endeavor of video analysis, does teach a 
user interface for displaying a set of video frames in the video previous to each key- 
frame and a set of video frames in the video subsequent to each key-frame and for 
obtaining a user selection of one or more of the video frames ("there may be false 
alarms or misses associated with the indexing process. A browsing interface may be 
used for users to identify and correct false alarms. For errors of missing correct scene 
changes, users may use the interactive interface during real-time playback of video to 
add scene changes to the results", Chang, paragraph 100). 

It would have been obvious at the time the invention was made for one of 
ordinary skill in the art to add an interactive interface as taught by Chang to the video 
analysis system of Li so "If a user is monitoring the scene change detection process and 
notices a miss or false detection, he or she can hit a key or click mouse to insert or 
remove a scene change in real time." (Chang, paragraph 99). 



5. Claims 12 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Li et al. (US 2003/0210886) in combination with Dufaux (US 6,711,587 B1).

Regarding claims 12 and 17, Li discloses all the elements of claims 1 and 14. Li
does not teach wherein the frame analyzers include a color layout analyzer. 

Dufaux, working in the same field of endeavor of keyframe selection does teach 
wherein the frame analyzers include a color layout analyzer ("Returning to FIG. 5, at 
step 502 a pixel-wise frame difference number is calculated for each frame. A measure 
of the amount of difference between pixels in successive frames may be used to 
determine a shot boundary in the digital video file", Dufaux, column 8, line 37, Dufaux's 
method computes color differences at specific x,y locations corresponding to a physical 
layout of the features). 

It would have been obvious at the time the invention was made for one of 
ordinary skill in the art to use the frame difference method of Dufaux with the video 
analysis system of Li to provide another method to test for shot boundaries since "A 
high value of pixel-wise frame difference indicates a possible shot boundary" (Dufaux, 
column 8, line 47). 
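
For illustration only, a pixel-wise frame difference test of the kind described in the quoted passages can be sketched as below. The grayscale frame representation, the use of a mean absolute difference, the threshold value, and the function names are assumptions made for this sketch; Dufaux's exact measure is not reproduced here.

```python
from typing import List, Sequence

Frame = Sequence[int]  # assumed: flattened grayscale frame, one intensity per pixel


def pixelwise_difference(prev: Frame, curr: Frame) -> float:
    """Mean absolute difference between co-located pixels of two frames."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of pixels")
    if not prev:
        return 0.0
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)


def possible_shot_boundaries(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Indices i where the difference between frame i-1 and frame i exceeds threshold."""
    return [
        i
        for i in range(1, len(frames))
        if pixelwise_difference(frames[i - 1], frames[i]) > threshold
    ]


# Illustrative sequence: two dark frames, then an abrupt change to bright frames.
frames = [
    [10, 10, 10, 10],
    [12, 10, 9, 10],
    [200, 210, 190, 205],
    [201, 209, 190, 205],
]
boundaries = possible_shot_boundaries(frames, threshold=50.0)
```

Because pixels are compared at the same positions in successive frames, the measure is sensitive to where a change occurs as well as to how large it is.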

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Thomas M. Redding whose telephone number is (571) 
270-1579. The examiner can normally be reached on Mon - Fri 7:30 am - 5:00 pm EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Vikkram Bali, can be reached on (571) 272-7415. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/TMR/ 




PRIMARY EXAMINER 



